HR-Depth: High Resolution Self-Supervised Monocular Depth Estimation

Authors

Abstract

Self-supervised learning shows great potential in monocular depth estimation, using image sequences as the only source of supervision. Although people try to use high-resolution images for depth estimation, the accuracy of prediction has not been significantly improved. In this work, we find the core reason comes from the inaccurate depth estimation in large gradient regions, making the bilinear interpolation error gradually disappear as the resolution increases. To obtain more accurate depth estimation in large gradient regions, it is necessary to obtain high-resolution features with spatial and semantic information. Therefore, we present an improved DepthNet, HR-Depth, with two effective strategies: (1) re-design the skip-connection in DepthNet to get better high-resolution features and (2) propose a feature fusion Squeeze-and-Excitation (fSE) module to fuse features more efficiently. Using ResNet-18 as the encoder, HR-Depth surpasses all previous state-of-the-art (SoTA) methods with the least parameters at both high and low resolution. Moreover, previous SoTA methods are based on fairly complex and deep networks with a mass of parameters, which limits their real applications. Thus we also construct a lightweight network which uses MobileNetV3 as the encoder. Experiments show that the lightweight network can perform on par with many large models like Monodepth2 at high resolution with only 20% of the parameters. All codes and models will be available at https://github.com/shawLyu/HR-Depth.
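The abstract's point about interpolation error in large gradient regions can be illustrated with a toy 1-D example. The sketch below is not taken from the HR-Depth code: the function names, the step-shaped depth profile, and the strides are all hypothetical choices made for illustration, and linear interpolation stands in for the bilinear upsampling used on 2-D depth maps.

```python
def step_depth(n, edge, near=2.0, far=10.0):
    """1-D depth profile with one sharp edge (a large-gradient region)."""
    return [near if i < edge else far for i in range(n)]


def downsample(signal, stride):
    """Keep every `stride`-th sample, standing in for a low-resolution prediction."""
    return signal[::stride]


def upsample_linear(coarse, stride, n):
    """Linearly interpolate coarse samples back to n points (1-D analogue of bilinear)."""
    out = []
    last = len(coarse) - 1
    for i in range(n):
        pos = i / stride
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        # Past the last coarse sample there is nothing to blend with, so hold its value.
        t = pos - lo if hi > lo else 0.0
        out.append((1.0 - t) * coarse[lo] + t * coarse[hi])
    return out


def mean_abs_error(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical scene: 64 pixels, depth jumps from 2.0 to 10.0 at pixel 30.
full = step_depth(64, edge=30)

errors = {}
restored_by_stride = {}
for stride in (8, 4, 2):
    restored = upsample_linear(downsample(full, stride), stride, len(full))
    restored_by_stride[stride] = restored
    errors[stride] = mean_abs_error(full, restored)

for stride in (8, 4, 2):
    print(f"stride {stride}: mean abs error = {errors[stride]:.5f}")
```

In this toy profile the flat regions are reconstructed exactly and all of the error sits in the few pixels around the depth edge, where interpolation blends near and far depths. The mean absolute error falls from about 0.28 at stride 8 to about 0.06 at stride 2, which matches the qualitative claim that the interpolation error shrinks as resolution increases. It says nothing about the network itself.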

Similar resources

Self-Supervised Monocular Image Depth Learning and Confidence Estimation

Convolutional Neural Networks (CNNs) need large amounts of data with ground truth annotation, which is a challenging problem that has limited the development and fast deployment of CNNs for many computer vision tasks. We propose a novel framework for depth estimation from monocular images with corresponding confidence in a self-supervised manner. A fully differential patch-based cost function is...

Depth Estimation Using Monocular and Stereo Cues

Depth estimation in computer vision and robotics is most commonly done via stereo vision (stereopsis), in which images from two cameras are used to triangulate and estimate distances. However, there are also numerous monocular visual cues (such as texture variations and gradients, defocus, color/haze, etc.) that have heretofore been little exploited in such systems. Some of these cues apply even...

Bayesian depth estimation from monocular natural images.

Estimating an accurate and naturalistic dense depth map from a single monocular photographic image is a difficult problem. Nevertheless, human observers have little difficulty understanding the depth structure implied by photographs. Two-dimensional (2D) images of the real-world environment contain significant statistical information regarding the three-dimensional (3D) structure of the world t...

Aperture Supervision for Monocular Depth Estimation

We present a novel method to train machine learning algorithms to estimate scene depths from a single image, by using the information provided by a camera’s aperture as supervision. Prior works use a depth sensor’s outputs or images of the same scene from alternate viewpoints as supervision, while our method instead uses images from the same viewpoint taken with a varying camera aperture. To en...

Qualitative Estimation of Depth in Monocular Vision

In this paper we propose two techniques to qualitatively estimate distance in monocular vision. Two kinds of approaches are described, the former based on texture analysis and the latter on histogram inspection. Although both methods only allow determining whether a point within an image is nearer or farther than another with respect to the observer, they can be usefully exploited in all t...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i3.16329